AITopics | similarity kernel



Appendices A Kernel methods, HSIC and pHSIC

Neural Information Processing Systems | Feb-8-2026, 10:36:42 GMT

In this case the neuron needs to memorize the last significant deviation from the background (i.e., it needs to remember z

artificial intelligence, divisive normalization, machine learning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)


Review for NeurIPS paper: Convergence and Stability of Graph Convolutional Networks on Large Random Graphs

Neural Information Processing Systems | Feb-8-2025, 07:20:26 GMT

Summary and Contributions: This paper presents a theoretical analysis of the convergence and stability properties of GCNs on large random graphs. It introduces continuous GCNs (c-GCNs) that act on a bounded, piecewise-Lipschitz function of unobserved latent node variables, which are linked through a similarity kernel. It has two main contributions. First, it studies notions of invariance and equivariance to isomorphism of random graph models and gives convergence results of discrete GCNs to c-GCNs for large graphs. Specifically, for the invariant case the authors claim that the outputs of both networks lie in the same output space.
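The convergence idea in this summary can be illustrated with a small numerical sketch. It covers only the neighbourhood-aggregation step of a graph convolution (no weights or nonlinearity) and is not the reviewed paper's construction: the Gaussian `kernel`, the cosine `signal`, the uniform latent variables on [0, 1], and the 1/n normalization are all assumptions chosen for illustration.

```python
import numpy as np

def kernel(x, y, sigma=0.2):
    """Hypothetical similarity kernel W(x, y) in [0, 1] on latent variables."""
    return np.exp(-(x[:, None] - y[None, :]) ** 2 / (2 * sigma**2))

def signal(x):
    """Bounded, Lipschitz node signal f(x) evaluated at latent positions."""
    return np.cos(2 * np.pi * x)

def discrete_aggregate(n, rng):
    """Sample a random graph on n nodes and return latents and (1/n) * A @ f."""
    x = rng.uniform(size=n)                      # unobserved latent variables
    probs = kernel(x, x)                         # edge probabilities W(x_i, x_j)
    upper = np.triu(rng.uniform(size=(n, n)) < probs, k=1)
    A = (upper | upper.T).astype(float)          # symmetric, no self-loops
    return x, A @ signal(x) / n

def continuous_aggregate(x, grid_size=2000):
    """Midpoint-rule approximation of the integral of W(x, y) f(y) over [0, 1]."""
    y = (np.arange(grid_size) + 0.5) / grid_size
    return kernel(x, y) @ signal(y) / grid_size

rng = np.random.default_rng(0)
errors = {}
for n in (50, 2000):
    x, out = discrete_aggregate(n, rng)
    # Mean absolute gap between the sampled-graph output and its continuous limit.
    errors[n] = float(np.mean(np.abs(out - continuous_aggregate(x))))
print(errors)
```

Under these assumptions the gap between the discrete and continuous aggregation shrinks as the number of nodes grows, which is the kind of large-graph behaviour the summary refers to.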

artificial intelligence, convergence and stability, graph convolutional network, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.90)


Reviews: Human-in-the-Loop Interpretability Prior

Neural Information Processing Systems | Oct-7-2024, 04:45:56 GMT

More than that, it depends on the purpose for which an explanation is being desired. Assessing whether a model is fit-for-purpose would entail defining a specific task, which as you state is not something you do in this paper. Nevertheless I think it's an important part of framing the general problem.

interpretability, proxy, similarity kernel, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.31)


Zero Shot Molecular Generation via Similarity Kernels

Elijošius, Rokas, Zills, Fabian, Batatia, Ilyes, Norwood, Sam Walton, Kovács, Dávid Péter, Holm, Christian, Csányi, Gábor

arXiv.org Artificial Intelligence | Feb-13-2024

The combinatorial scaling of the available chemical space with molecule size is one of the main challenges in the design of new molecules and materials. Generative modelling aims to solve this by directly proposing structures with desirable properties, without exhaustively enumerating and screening candidates. Recently, diffusion-based models have achieved impressive results in molecular docking [1] and generation of linkers [2], drug-like molecules [3, 4] and crystal structures [5, 6]. Diffusion models are trained to reverse a stochastic noising process, which gradually corrupts samples of training data until they are indistinguishable from samples drawn from an uninformative prior distribution, such as a standard Gaussian [7-9].

[...] Gaussian, an approach known as denoising score matching [10-12]. In the context of molecule generation, the score is closely related to atomic forces. Consider training data that comprise configurations sampled using molecular dynamics or other methods from an underlying Boltzmann distribution, x ∼ exp(−βU(x))/Z. Here, x = {r, z} is a set that represents a molecule, with r the atomic positions and z the chemical elements, U(x) the potential energy, β the inverse temperature, and Z the partition function. In this case, when the elements z are fixed, the score of the data distribution s(x, 0) corresponds to the atomic force (defined as the negative gradient of the potential energy) up to a multiplicative constant:
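For a Boltzmann-distributed sample with fixed elements z, the score-force relation mentioned in this excerpt can be written out as follows. This is a sketch reconstructed from the definitions in the excerpt rather than the paper's own equation; the symbols p (data density), F (atomic force) and \nabla_r (gradient with respect to atomic positions) are notation assumed here.

```latex
\begin{align}
p(x) &= \frac{\exp\bigl(-\beta U(x)\bigr)}{Z}, \\
s(x, 0) &= \nabla_{r} \log p(x)
         = -\beta \, \nabla_{r} U(x)
         = \beta \, F(x),
\qquad F(x) := -\nabla_{r} U(x).
\end{align}
```

The partition function Z does not depend on the positions r, so its logarithm drops out of the gradient and the multiplicative constant is the inverse temperature β.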

arxiv, atom, molecule, (16 more...)

arXiv.org Artificial Intelligence

2402.08708

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.05)
North America > United States > Colorado > El Paso County > Colorado Springs (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)

Genre: Research Report (0.64)

Industry:

Energy (0.93)
Materials > Chemicals (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)


SBSM-Pro: Support Bio-sequence Machine for Proteins

Wang, Yizheng, Zhai, Yixiao, Ding, Yijie, Zou, Quan

arXiv.org Artificial Intelligence | Nov-4-2023

Bio-sequences, which include DNA, RNA, and proteins, are the molecular foundation of modern genetic research. The classification of bio-sequences based on sequence information has been a key focus in bioinformatics research. At present, with the sequential completion of genome mapping from humans to various species, we have amassed a vast amount of sequence data, creating an urgent need for computer-assisted annotation of sequence functions. Although it is statistically evident that genetic sequences determine hereditary diseases, the mechanisms by which sequence variations contribute to diseases are intricately complex. It is difficult to address and interpret all these issues through one biological experiment; hence, multiple computer predictions are needed to guide the progression of wet lab exploration. In summary, the application of information science and machine learning to bio-sequence classification is a valuable tool for assisting researchers in comprehending and analysing bio-sequences. It serves as a key driving force for advancing research in the field of bioinformatics. In the field of bio-sequence classification, machine learning methods are broadly pursued using two strategies: feature extraction combined with traditional classification methods and direct sequence classification via deep learning techniques. For bio-sequences, relevant features are mainly characterized as frequency, physicochemical, structural, and evolutionary features.

dataset, kernel, sequence, (14 more...)

arXiv.org Artificial Intelligence

2308.10275

Country:

Asia > China > Zhejiang Province (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education > Health & Safety > School Nutrition (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


t-METASET: Tailoring Property Bias of Large-Scale Metamaterial Datasets through Active Learning

Lee, Doksoo, Chan, Yu-Chin, Chen, Wei Wayne, Wang, Liwei, van Beek, Anton, Chen, Wei

arXiv.org Artificial Intelligence | Aug-18-2022

Inspired by the recent achievements of machine learning in diverse domains, data-driven metamaterials design has emerged as a compelling paradigm that can unlock the potential of multiscale architectures. The model-centric research trend, however, lacks principled frameworks dedicated to data acquisition, whose quality propagates into the downstream tasks. Often built by naive space-filling design in shape descriptor space, metamaterial datasets suffer from property distributions that are either highly imbalanced or at odds with design tasks of interest. To this end, we present t-METASET: an active-learning-based data acquisition framework aiming to guide both diverse and task-aware data generation. Distinctly, we seek a solution to a commonplace yet frequently overlooked scenario at early stages of data-driven design of metamaterials: when a massive (~O(10^4)) shape-only library has been prepared with no properties evaluated. The key idea is to harness a data-driven shape descriptor learned from generative models, fit a sparse regressor as a start-up agent, and leverage metrics related to diversity to drive data acquisition to areas that help designers fulfill design goals. We validate the proposed framework in three deployment cases, which encompass general use, task-specific use, and tailorable use. Two large-scale mechanical metamaterial datasets are used to demonstrate the efficacy. Applicable to general image-based design representations, t-METASET could boost future advancements in data-driven design.

dataset, descriptor, diversity, (15 more...)

arXiv.org Artificial Intelligence

2202.10565

Country:

North America > United States > Illinois > Cook County > Evanston (0.04)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Ireland (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)


Multi-Time Attention Networks for Irregularly Sampled Time Series

Shukla, Satya Narayan, Marlin, Benjamin M.

arXiv.org Artificial Intelligence | Jan-25-2021

Irregular sampling occurs in many time series modeling applications where it presents a significant challenge to standard deep learning models. This work is motivated by the analysis of physiological time series data in electronic health records, which are sparse, irregularly sampled, and multivariate. In this paper, we propose a new deep learning framework for this setting that we call Multi-Time Attention Networks. Multi-Time Attention Networks learn an embedding of continuous time values and use an attention mechanism to produce a fixed-length representation of a time series containing a variable number of observations. We investigate the performance of our framework on interpolation and classification tasks using multiple datasets. Our results show that our approach performs as well as or better than a range of baseline and recently proposed models while offering significantly faster training times than current state-of-the-art methods. Irregularly sampled time series occur in applications including healthcare, climate science, ecology, astronomy, biology and others. It is well understood that irregular sampling poses a significant challenge to machine learning models, which typically assume fully-observed, fixed-size feature representations (Yadav et al., 2018). While recurrent neural networks (RNNs) have been widely used to model such data because of their ability to handle variable length sequences, basic RNNs assume regular spacing between observation times as well as alignment of the time points where observations occur for different variables (i.e., fully-observed vectors). In practice, both of these assumptions can fail to hold for real-world sparse and irregularly observed time series.
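The mechanism described in this abstract, attention over an embedding of continuous time that yields a fixed-length output, can be sketched in a few lines. This is a simplified illustration, not the authors' model: the embedding form (one linear dimension plus sinusoidal dimensions), the random untrained parameters `omega` and `alpha`, and the absence of learned query/key projections are assumptions made here.

```python
import numpy as np

def time_embedding(t, omega, alpha):
    """Embed scalar times: first dimension linear in t, the rest sinusoidal."""
    z = np.outer(t, omega) + alpha          # shape (len(t), d)
    z[:, 1:] = np.sin(z[:, 1:])
    return z

def time_attention(ref_times, obs_times, obs_values, omega, alpha):
    """Attend from fixed reference times to irregular observation times."""
    q = time_embedding(ref_times, omega, alpha)     # queries at reference times
    k = time_embedding(obs_times, omega, alpha)     # keys at observed times
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)     # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ obs_values                     # (len(ref_times), n_features)

rng = np.random.default_rng(0)
omega = rng.normal(size=16)     # untrained frequencies (would be learned)
alpha = rng.normal(size=16)     # untrained phases (would be learned)
ref_times = np.linspace(0.0, 1.0, 8)

# Two series with different numbers of irregularly spaced observations.
t_a, v_a = np.sort(rng.uniform(size=5)), rng.normal(size=(5, 2))
t_b, v_b = np.sort(rng.uniform(size=23)), rng.normal(size=(23, 2))
rep_a = time_attention(ref_times, t_a, v_a, omega, alpha)
rep_b = time_attention(ref_times, t_b, v_b, omega, alpha)
print(rep_a.shape, rep_b.shape)
```

Both series map to an array with one row per reference time, regardless of how many observations each contains.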

irregularly, time point, time series, (16 more...)

arXiv.org Artificial Intelligence

2101.10318

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Asia > Middle East > Israel (0.04)

Genre:

Research Report > New Finding (0.54)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Health Care Technology > Medical Record (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


METASET: Exploring Shape and Property Spaces for Data-Driven Metamaterials Design

Chan, Yu-Chin, Ahmed, Faez, Wang, Liwei, Chen, Wei

arXiv.org Machine Learning | Sep-15-2020

Data-driven design of mechanical metamaterials is an increasingly popular method to combat costly physical simulations and immense, often intractable, geometrical design spaces. Using a precomputed dataset of unit cells, a multiscale structure can be quickly filled via combinatorial search algorithms, and machine learning models can be trained to accelerate the process. However, the dependence on data induces a unique challenge: An imbalanced dataset containing more of certain shapes or physical properties can be detrimental to the efficacy of data-driven approaches. In answer, we posit that a smaller yet diverse set of unit cells leads to scalable search and unbiased learning. To select such subsets, we propose METASET, a methodology that 1) uses similarity metrics and positive semi-definite kernels to jointly measure the closeness of unit cells in both shape and property spaces, and 2) incorporates Determinantal Point Processes for efficient subset selection. Moreover, METASET allows the trade-off between shape and property diversity so that subsets can be tuned for various applications. Through the design of 2D metamaterials with target displacement profiles, we demonstrate that smaller, diverse subsets can indeed improve the search process as well as structural performance. By eliminating inherent overlaps in a dataset of 3D unit cells created with symmetry rules, we also illustrate that our flexible method can distill unique subsets regardless of the metric employed. Our diverse subsets are provided publicly for use by any designer.
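The two ingredients named in this abstract, a positive semi-definite similarity kernel over shape and property spaces and a Determinantal Point Process for subset selection, can be sketched as below. This is an illustrative stand-in, not the METASET implementation: the RBF kernel, the weighted-sum `joint_kernel` with trade-off weight `w`, and the naive greedy log-determinant selection are all assumptions made here.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Positive semi-definite RBF similarity: exp(-gamma * ||a - b||^2)."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)

def joint_kernel(shape_feats, prop_feats, w=0.5, gamma=1.0):
    """Hypothetical shape/property trade-off: a convex mix of two PSD kernels."""
    return w * rbf_kernel(shape_feats, gamma) + (1.0 - w) * rbf_kernel(prop_feats, gamma)

def greedy_dpp(L, k):
    """Greedily pick up to k indices that approximately maximize det(L[S, S])."""
    selected, remaining = [], list(range(L.shape[0]))
    for _ in range(k):
        best_i, best_logdet = None, -np.inf
        for i in remaining:
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_i, best_logdet = i, logdet
        if best_i is None:      # every remaining item is redundant
            break
        selected.append(best_i)
        remaining.remove(best_i)
    return selected

rng = np.random.default_rng(0)
shapes = rng.normal(size=(30, 4))   # placeholder shape descriptors
props = rng.normal(size=(30, 2))    # placeholder property vectors
subset = greedy_dpp(joint_kernel(shapes, props, w=0.5), k=5)
print(subset)
```

Because the determinant of a kernel submatrix collapses toward zero when two selected items are near-duplicates, the greedy step favours items that are dissimilar to those already chosen; `w` shifts the emphasis between shape and property diversity.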

artificial intelligence, machine learning, optimization problem, (18 more...)

arXiv.org Machine Learning

doi: 10.1115/1.4048629

2006.02142

Country:

North America > United States > Illinois > Cook County > Evanston (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
(2 more...)


Similarity Kernel and Clustering via Random Projection Forests

Yan, Donghui, Gu, Songxiang, Xu, Ying, Qin, Zhiwei

arXiv.org Machine Learning | Aug-27-2019

Similarity plays a fundamental role in many areas, including data mining, machine learning, statistics and various applied domains. Inspired by the success of ensemble methods and the flexibility of trees, we propose to learn a similarity kernel called rpf-kernel through random projection forests (rpForests). Our theoretical analysis reveals a highly desirable property of rpf-kernel: far-away (dissimilar) points have a low similarity value while nearby (similar) points have a high similarity, and the similarities have a native interpretation as the probability of points remaining in the same leaf nodes during the growth of rpForests. The learned rpf-kernel leads to an effective clustering algorithm--rpfCluster. On a wide variety of real and benchmark datasets, rpfCluster compares favorably to K-means clustering, spectral clustering and a state-of-the-art clustering ensemble algorithm--Cluster Forests. Our approach is simple to implement and readily adapts to the geometry of the underlying data. Given its desirable theoretical property and competitive empirical performance when applied to clustering, we expect rpf-kernel to be applicable to many problems of an unsupervised nature or as a regularizer in some supervised or weakly supervised settings.
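The interpretation given in this abstract, similarity as the probability that two points stay in the same leaf, suggests a simple estimator: grow several random projection trees and count how often each pair shares a leaf. The sketch below is a minimal version under assumptions made here (Gaussian random directions, a split point drawn uniformly between the smallest and largest projection, a fixed `leaf_size`); it is not the authors' rpForests code.

```python
import numpy as np

def _grow(X, idx, rng, leaf_size, leaves):
    """Recursively split the points in idx along random projections."""
    if len(idx) <= leaf_size:
        leaves.append(idx)
        return
    direction = rng.normal(size=X.shape[1])
    proj = X[idx] @ direction
    cut = rng.uniform(proj.min(), proj.max())
    left, right = idx[proj <= cut], idx[proj > cut]
    if len(left) == 0 or len(right) == 0:   # degenerate split, stop here
        leaves.append(idx)
        return
    _grow(X, left, rng, leaf_size, leaves)
    _grow(X, right, rng, leaf_size, leaves)

def rpf_similarity(X, n_trees=50, leaf_size=10, seed=0):
    """Fraction of trees in which each pair of points shares a leaf."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    K = np.zeros((n, n))
    for _ in range(n_trees):
        leaves = []
        _grow(X, np.arange(n), rng, leaf_size, leaves)
        for leaf in leaves:
            K[np.ix_(leaf, leaf)] += 1.0
    return K / n_trees

# Two well-separated clusters of 10 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, size=(10, 2)),
               rng.normal(10.0, 0.1, size=(10, 2))])
K = rpf_similarity(X)
within = K[:10, :10].mean()
across = K[:10, 10:].mean()
print(within, across)
```

The resulting matrix is symmetric with ones on the diagonal, and it could be passed to any clustering routine that accepts a precomputed similarity matrix.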

data mining, machine learning, similarity kernel, (18 more...)

arXiv.org Machine Learning

1908.10506

Country:

North America > United States > Massachusetts (0.46)
North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)
